Speech Synthesis in Noisy Environment by Enhancing Strength of Excitation and Formant Prominence
نویسندگان
چکیده
Text-to-speech (TTS) synthesis systems have grown popularity due to their diverse practical usability. While most of the technologies developed aims to meet requirements in laboratory environment, the practical appliance is not limited to a specific environment. This work aims towards improving intelligibility of synthesized speech to make it deployable in realism. Based on the comparison of Lombard speech and speech produced in quiet, strength of excitation is found to play a crucial role in making speech intelligible in noisy situation. A novel method for enhancement of strength of excitation is proposed which makes the synthesized speech more intelligible in practical scenario. Linear-prediction analysis based formant enhancement method is also employed to further improve the intelligibility. The proposed enhancement framework is applied in synthesized speech and evaluated in presence of different types and levels of noise. Subjective evaluation results show that, the proposed method makes the synthesized speech applicable in practical noisy environment..
منابع مشابه
Kalman tracking of linear predictor and harmonic noise models for noisy speech enhancement
This paper presents a speech enhancement method based on the tracking and denoising of the formants of a linear prediction (LP) model of the spectral envelope of speech and the parameters of a harmonic noise model (HNM) of its excitation. The main advantages of tracking and denoising the prominent energy contours of speech are the efficient use of the spectral and temporal structures of success...
متن کاملStatistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language
Setup of an emotion recognition or emotional speech recognition system is directly related to how emotion changes the speech features. In this research, the influence of emotion on the anger and happiness was evaluated and the results were compared with the neutral speech. So the pitch frequency and the first three formant frequencies were used. The experimental results showed that there are lo...
متن کاملComparison of formant enhancement methods for HMM-based speech synthesis
Hidden Markov model (HMM) based speech synthesis has a tendency to over-smooth the spectral envelope of speech, which makes the speech sound muffled. One means to compensate for the over-smoothing is to enhance the formants of the spectral model. This paper compares the performance of different formant enhancement methods, and studies the enhancement of the formants prior to HMM training in ord...
متن کاملA Formant Tracking Lp Model for Speech Processing in Car/train Noise
Formant estimation becomes complicated in the presence of correlated background noise such as car and train noise as the spectrum of noise from revolving mechanical sources have their own spectral peaks that affect the number and positions of the observed peaks in noisy speech spectrum. This paper investigates the modeling and estimation of spectral parameters at formants of noisy speech in the...
متن کاملProsodic effects on vowel production: evidence from formant structure
Speakers communicate pragmatic and discourse meaning through the prosodic form assigned to an utterance, and listeners must attend to the acoustic cues to prosodic form to fully recover the speaker’s intended meaning. While much of the research on prosody examines supra-segmental cues such as F0 and temporal patterns, prosody is also known to affect the phonetic properties of segments as well. ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016